Statistical inference using SGD

نویسندگان

Tianyang Li

Liu Liu

Anastasios Kyrillidis

Constantine Caramanis

چکیده

We present a novel method for frequentist statistical inference in M -estimation problems, based on stochastic gradient descent (SGD) with a fixed step size: we demonstrate that the average of such SGD sequences can be used for statistical inference, after proper scaling. An intuitive analysis using the OrnsteinUhlenbeck process suggests that such averages are asymptotically normal. From a practical perspective, our SGD-based inference procedure is a first order method, and is well-suited for large scale problems. To show its merits, we apply it to both synthetic and real datasets, and demonstrate that its accuracy is comparable to classical statistical methods, while requiring potentially far less computation.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Statistical Inference for Online Learning and Stochastic Approximation via Hierarchical Incremental Gradient Descent

Stochastic gradient descent (SGD) is an immensely popular approach for online learningin settings where data arrives in a stream or data sizes are very large. However, despite anever-increasing volume of work on SGD, much less is known about the statistical inferentialproperties of SGD-based predictions. Taking a fully inferential viewpoint, this paper introducesa novel proc...

متن کامل

On Scalable Inference with Stochastic Gradient Descent

In many applications involving large dataset or online updating, stochastic gradient descent (SGD) provides a scalable way to compute parameter estimates and has gained increasing popularity due to its numerical convenience and memory efficiency. While the asymptotic properties of SGD-based estimators have been established decades ago, statistical inference such as interval estimation remains m...

متن کامل

A Variational Analysis of Stochastic Gradient Algorithms

Stochastic Gradient Descent (SGD) is an important algorithm in machine learning. With constant learning rates, it is a stochastic process that, after an initial phase of convergence, generates samples from a stationary distribution. We show that SGD with constant rates can be effectively used as an approximate posterior inference algorithm for probabilistic modeling. Specifically, we show how t...

متن کامل

Inference, Converges to Limit Cycles for Deep Networks

Stochastic gradient descent (SGD) is widely believed to perform implicit regularization when used to train deep neural networks, but the precise manner in which this occurs has thus far been elusive. We prove that SGD minimizes an average potential over the posterior distribution of weights along with an entropic regularization term. This potential is however not the original loss function in g...

متن کامل